ABSTRACT

One of the best ways of spotting previously undetected systematic errors in CMB experiments is to 
compare two independent observations of the same region. We derive a set of tools for comparing and 
combining CMB data sets, applicable also in the common case where the two have different resolution or 
beam shape and therefore do not measure the same signal. We present a consistency test that is better 
than a $\chi^2$-test at detecting systematic errors. We show how two maps of different angular resolution
can be combined without smoothing the higher resolution down to the lower one, and generalize this to 
arbitrary beam configurations. We also show how lossless foreground removal can be performed even 
for foreground models involving scale dependence, latitude dependence and spectral index variations in 
combination. 

Subject headings: cosmic microwave background — methods: data analysis 



1. INTRODUCTION 

Undetected systematic errors are one of the main obsta- 
cles to using cosmic microwave background (CMB) mea- 
surements to constrain cosmological models. One of the 
best ways to address this problem is to compare experi- 
ments whose data sets overlap both in sky coverage and 
angular scale, to see whether they are consistent.$^1$ If they
are consistent, a useful second step is to combine them 
into a single sky map retaining all their cosmological in- 
formation. 

Both of these steps are simple for maps with identical 
resolution and beam shape. For e.g. the 53 and 90 GHz 
COBE DMR maps (Bennett et al. 1996), comparing (1) 
was done by subtracting the maps and checking whether 
the difference was consistent with pure noise, whereas com- 
bining (2) was done by simply averaging the two maps, 
weighting pixels by their inverse variance. Unfortunately, 
both steps are usually more complicated. MAP, Planck 
and most current experiments have different angular res- 
olution in different channels. Many current experiments 
probe the sky in an even more complicated way, with e.g. 
double beams, triple beams, interferometric beams or complicated
elongated software-modulated beams. Correlated
noise further complicates the problem. 

Despite these difficulties, precision comparisons between 
different experiments are crucial. Some of the best evidence
so far for detection of CMB fluctuations comes from
the success of such comparisons in the past: between
FIRS and DMR (Ganga et al. 1993), Tenerife and DMR
(Lineweaver et al. 1995), MSAM and Saskatoon (Knox et
al. 1998; hereafter K98), two years of Python data (Ruhl
et al. 1995), three years of Saskatoon data (Tegmark et al.
1996a) and two flights of MSAM (Inman et al. 1997). The
fact that many of these non-COBE data sets were con- 
taminated by systematic errors made the success of these 
cross-checks even more encouraging. 

Data sets are currently growing rapidly in number, size 
and quality, often overlapping. It is therefore quite timely 
to develop methods that generalize both steps (1) and 
(2) to arbitrary experiments. This is the purpose of the 
present Letter. In the larger context of CMB data analysis, 
this is important between the steps of mapmaking (Wright 
et al. 1996; Wright 1996; Tegmark 1997a) and power spec- 
trum estimation (Tegmark 1997b; Bond et al. 1998) in the 
pipeline. 

2. NOTATION 

Let us first establish some notation that will be used 
throughout this paper. Consider a pixelized CMB sky 
map at some resolution consisting of $m$ numbers $x_1, \ldots, x_m$,
where $x_i$ is the temperature in the $i$th pixel. Suppose two
experiments $i = 1, 2$ have measured $n_i$ numbers $y_1, \ldots, y_{n_i}$,
each probing some linear combination of the sky temperatures $x_i$.
Grouping these numbers into vectors $\mathbf{x}$, $\mathbf{y}_1$ and
$\mathbf{y}_2$ of length $m$, $n_1$ and $n_2$, we can generally write$^2$

$$\mathbf{y}_1 = \mathbf{A}_1\mathbf{x} + \mathbf{n}_1, \qquad \mathbf{y}_2 = \mathbf{A}_2\mathbf{x} + \mathbf{n}_2 \tag{1}$$

for some known matrices $\mathbf{A}_i$ incorporating the beam
shapes and some random noise vectors $\mathbf{n}_i$ with zero mean
($\langle\mathbf{n}_i\rangle = 0$). We will refer to $\mathbf{x}$ as the "true sky". $\mathbf{y}_1$ and $\mathbf{y}_2$

1 The best way to address the problem is clearly to design CMB experiments to be more immune to systematic errors in the first place. The 
next best thing to do is carefully examine the raw time-ordered data from an experiment for specific forms of systematic errors that may be 
expected, and for general signs of systematic errors for data removal or correction. The cross-checks between experiments that are discussed in
this paper are by no means a substitute for this, but rather a way of catching additional systematic errors that have slipped through the cracks
and not been detected by the team that reduced the raw data. 

2 This assumes both that the experimental data have perfect linearity, and that there are no non-zero experimental offsets. Ideally, experimentalists
should model and remove both nonlinearities and offsets as part of their data reduction, thereby making equation (1) applicable to
their final data product. If offsets of an unknown amplitude remain nonetheless, multiplication of a data set $\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{n} + \mathrm{offsets}$ by a matrix
$\mathbf{P}$ that projects out these offsets will produce a new data set $\mathbf{y}' = \mathbf{P}\mathbf{y}$ that satisfies equation (1), merely with the slightly more complicated
noise vector $\mathbf{n}' = \mathbf{P}\mathbf{n}$ and with $\mathbf{A}' = \mathbf{P}\mathbf{A}$. A detailed example of this procedure can be found in de Oliveira et al. (1998).
can be either time-ordered data or some linear combination
thereof, for instance pixelized maps. It is sometimes
useful to define the larger matrices and vectors 



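$$\mathbf{A} \equiv \begin{pmatrix}\mathbf{A}_1\\ \mathbf{A}_2\end{pmatrix}, \qquad \mathbf{y} \equiv \begin{pmatrix}\mathbf{y}_1\\ \mathbf{y}_2\end{pmatrix}, \qquad \mathbf{n} \equiv \begin{pmatrix}\mathbf{n}_1\\ \mathbf{n}_2\end{pmatrix}, \tag{2}$$

which allows us to rewrite equation (1) as

$$\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{n}. \tag{3}$$

Let us write the noise covariance matrix as

$$\mathbf{N} \equiv \langle\mathbf{n}\mathbf{n}^t\rangle = \begin{pmatrix}\mathbf{N}_1 & \mathbf{N}_{12}\\ \mathbf{N}_{12}^t & \mathbf{N}_2\end{pmatrix}, \tag{4}$$

where $\mathbf{N}_1 \equiv \langle\mathbf{n}_1\mathbf{n}_1^t\rangle$, $\mathbf{N}_2 \equiv \langle\mathbf{n}_2\mathbf{n}_2^t\rangle$ and $\mathbf{N}_{12} \equiv \langle\mathbf{n}_1\mathbf{n}_2^t\rangle$.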

We derive useful consistency tests in §3 both for the special
case of identical observations ($\mathbf{A}_1 = \mathbf{A}_2$) and for the
general case $\mathbf{A}_1 \neq \mathbf{A}_2$, then show how to combine data
sets without destroying information in §4.
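
The notation of equations (1)-(4) can be made concrete with a small NumPy sketch. Everything numerical here is an assumption chosen for illustration (the pixel count, the random weights in `A2`, the noise levels); it is not taken from any real experiment.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n1, n2 = 6, 6, 4                      # sky pixels and data points per experiment

A1 = np.eye(n1, m)                       # experiment 1: a plain pixelized sky map
A2 = rng.uniform(0.0, 1.0, (n2, m))      # experiment 2: arbitrary weighted averages
A2 /= A2.sum(axis=1, keepdims=True)      # normalize each row to unit sum

N1 = 0.04 * np.eye(n1)                   # noise covariance of experiment 1
N2 = 0.01 * np.eye(n2)                   # noise covariance of experiment 2
N12 = np.zeros((n1, n2))                 # no noise correlation between experiments

A = np.vstack([A1, A2])                  # stacked matrix of equation (2)
N = np.block([[N1, N12], [N12.T, N2]])   # block covariance of equation (4)

x = rng.standard_normal(m)               # toy "true sky"
n = rng.multivariate_normal(np.zeros(n1 + n2), N)
y = A @ x + n                            # equation (3)
y1, y2 = y[:n1], y[n1:]                  # the two data sets of equation (1)
```

Stacking the two experiments this way is what lets the later formulas treat any number of data sets as a single linear system.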

3. COMPARING DATA SETS: ARE THEY CONSISTENT? 
3.1. The "null-buster" test 

Let us first consider the simple case where the two data
sets measure the same thing, i.e., $\mathbf{A}_1 = \mathbf{A}_2$. This often
applies for two different channels of the same experiment
at the same frequency. We can then form a difference map
$\mathbf{z} \equiv \mathbf{y}_1 - \mathbf{y}_2$, which in the absence of systematic errors
should consist of pure noise.

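Consider the null hypothesis $H_0$ that such a data set $\mathbf{z}$
consists of pure noise, i.e., $\langle\mathbf{z}\rangle = 0$, $\langle\mathbf{z}\mathbf{z}^t\rangle = \mathbf{N}$ for some
noise covariance matrix $\mathbf{N}$. Suppose we have reason to
suspect that the alternative hypothesis $H_1$ is true, where
$\langle\mathbf{z}\rangle = 0$, $\langle\mathbf{z}\mathbf{z}^t\rangle = \mathbf{N} + \mathbf{S}$ for some signal covariance matrix
$\mathbf{S}$, and want to try to rule out $H_0$ by using a test statistic
$q$ that is a quadratic function of the data:

$$q \equiv \mathbf{z}^t\mathbf{E}\mathbf{z} = \mathrm{tr}\,\{\mathbf{E}\mathbf{z}\mathbf{z}^t\}. \tag{5}$$
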
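Depending on whether $H_0$ or $H_1$ is true, the mean of $q$ will
be $\langle q\rangle_0 = \mathrm{tr}\,\mathbf{E}\mathbf{N}$ or $\langle q\rangle_1 = \mathrm{tr}\,\mathbf{E}\mathbf{N} + \mathrm{tr}\,\mathbf{E}\mathbf{S}$, respectively. If $\mathbf{z}$
has a multivariate Gaussian probability distribution$^3$, then
$q$ will have a variance $(\Delta q)^2 = 2\,\mathrm{tr}\,\mathbf{E}\mathbf{N}\mathbf{E}\mathbf{N}$ if $H_0$ is true.
Therefore the quantity $\nu \equiv (q - \langle q\rangle_0)/\Delta q$ gives the number
of standard deviations ("sigmas") by which the observed
$q$-value exceeds the mean expected under the null hypothesis.
If we observe $\nu \gg 1$, we can thus conclude that $H_0$ is
ruled out at high significance. Which choice of $\mathbf{E}$ has the
greatest statistical power to reject $H_0$ if $H_1$ is true, i.e.,
which $\mathbf{E}$ maximizes the expectation value

$$\langle\nu\rangle = \frac{\langle q\rangle_1 - \langle q\rangle_0}{\Delta q} = \frac{\mathrm{tr}\,\mathbf{E}\mathbf{S}}{[2\,\mathrm{tr}\,\mathbf{E}\mathbf{N}\mathbf{E}\mathbf{N}]^{1/2}}\,? \tag{6}$$
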

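Since rescaling $\mathbf{E}$ by a constant leaves $\langle\nu\rangle$ invariant, let us
for simplicity normalize $\mathbf{E}$ so that the denominator equals
unity. We thus want to maximize $\mathrm{tr}\,\mathbf{E}\mathbf{S}$ subject to the
constraint that $\mathrm{tr}\,\mathbf{E}\mathbf{N}\mathbf{E}\mathbf{N} = 1/2$. Using the method of
Lagrange multipliers with $L = \mathrm{tr}\,\mathbf{E}\mathbf{S} - \lambda\,\mathrm{tr}\,\mathbf{E}\mathbf{N}\mathbf{E}\mathbf{N}/2$ and
differentiating $L$ with respect to the components of $\mathbf{E}$, this
gives the solution $\mathbf{E} \propto \mathbf{N}^{-1}\mathbf{S}\mathbf{N}^{-1}$. This leaves us with our
optimal "null-buster" statistic$^4$

$$\nu = \frac{\mathbf{z}^t\mathbf{N}^{-1}\mathbf{S}\mathbf{N}^{-1}\mathbf{z} - \mathrm{tr}\,\mathbf{N}^{-1}\mathbf{S}}{[2\,\mathrm{tr}\,\{\mathbf{N}^{-1}\mathbf{S}\mathbf{N}^{-1}\mathbf{S}\}]^{1/2}}, \tag{7}$$
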

which will rule out the null hypothesis $H_0$ with the largest
average significance if $H_1$ is true. For the special case
$\mathbf{S} = \mathbf{N}$, we see that this reduces to a standard $\chi^2$-test
with $\nu = (\chi^2 - n)/\sqrt{2n}$, $\chi^2 \equiv \mathbf{z}^t\mathbf{N}^{-1}\mathbf{z}$. Whenever we
have reason to suspect systematic errors of a certain form
(producing a signal $\propto \mathbf{S}$), the null-buster test will thus be
more sensitive to these systematic errors than the $\chi^2$-test,
which is a general-purpose tool. This issue is elaborated
in K98, which also provides a useful general discussion of
consistency tests.
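
A minimal NumPy sketch of equation (7) follows, together with the $\chi^2$ statistic it reduces to when $\mathbf{S} = \mathbf{N}$. The function names, the Gaussian-shaped signal template and the sinusoidal "systematic" are hypothetical choices made only for this illustration.

```python
import numpy as np

def null_buster(z, N, S):
    """Return nu of equation (7): sigmas by which z is inconsistent with pure noise."""
    Ninv_z = np.linalg.solve(N, z)              # N^{-1} z
    Ninv_S = np.linalg.solve(N, S)              # N^{-1} S
    q = Ninv_z @ S @ Ninv_z                     # z^t N^{-1} S N^{-1} z
    mean_q = np.trace(Ninv_S)                   # expectation of q under H0
    sigma_q = np.sqrt(2.0 * np.trace(Ninv_S @ Ninv_S))
    return (q - mean_q) / sigma_q

def chi2_sigmas(z, N):
    """Standard chi^2 test expressed in sigmas: (chi^2 - n) / sqrt(2n)."""
    n = len(z)
    return (z @ np.linalg.solve(N, z) - n) / np.sqrt(2.0 * n)

# Toy example: white noise plus a weak, smooth (large-scale) contaminant.
rng = np.random.default_rng(0)
n = 200
pix = np.arange(n)
N = np.eye(n)
S = np.exp(-0.5 * ((pix[:, None] - pix[None, :]) / 10.0) ** 2)  # assumed signal shape
signal = 0.5 * np.sin(2 * np.pi * pix / 80.0)                  # hypothetical systematic
z = rng.standard_normal(n) + signal

nu_null_buster = null_buster(z, N, S)
nu_chi2 = chi2_sigmas(z, N)
```

For a smooth contaminant of this kind one would expect `nu_null_buster` to exceed `nu_chi2`, although the actual values depend on the noise realization. Note that `null_buster` is unchanged if `S` is multiplied by a constant, reflecting that only the shape of the suspected signal matters.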



[Figure 1: plot of "How inconsistent is $\mathbf{y}_1 - r\mathbf{y}_2$ with pure noise?" against the relative normalization $r$, with tick marks at 0.1, 1 and 10.]

Fig. 1. The null-buster test against calibration errors was applied
to the QMAP experiment (see de Oliveira-Costa et al. 1998).
The figure (from Devlin et al. 1998) shows the number of $\sigma$ at which
signal is detected in the weighted flight 1 Ka-band difference maps
$\mathbf{y}_1 - r\mathbf{y}_2$. The $\chi^2$-test is seen to be weaker.



3.2. An example: calibration errors 

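The null-buster is useful for comparing two CMB maps
$\mathbf{y}_1$ and $\mathbf{y}_2$ that have the same shape, beam size and pixelization.
Let $\mathbf{S}$ denote the expected contribution to $\langle\mathbf{y}_i\mathbf{y}_i^t\rangle$
from CMB fluctuations, i.e., $\mathbf{S} = \mathbf{A}_1\mathbf{C}\mathbf{A}_1^t = \mathbf{A}_2\mathbf{C}\mathbf{A}_2^t$,
where $\mathbf{C}$ is the map covariance matrix

$$\mathbf{C}_{ij} \equiv \sum_{\ell} \frac{2\ell+1}{4\pi}\,P_\ell(\hat{\mathbf{r}}_i\cdot\hat{\mathbf{r}}_j)\,C_\ell, \tag{8}$$
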

3 We will only make the assumption of Gaussianity for the CMB and the detector noise, not for the systematic errors. In fact, systematics, 
such as foreground signals, data spikes and atmospherics seldom have a Gaussian probability distribution, but they vanish under the null 
hypothesis that we are trying to rule out. 

4 K98 discuss a test using the likelihood ratio, which is the best solution to a slightly different problem. Translated into our notation, it
corresponds to the choice $\mathbf{E} = \mathbf{N}^{-1} - [\mathbf{N} + \mathbf{S}]^{-1} = \mathbf{N}^{-1}\mathbf{S}[\mathbf{N} + \mathbf{S}]^{-1}$. Note that whereas equation (7) is independent of the normalization of $\mathbf{S}$
(the "shape" of the signal matters, but not its amplitude), the likelihood ratio test requires an assumed amplitude.
the unit vector $\hat{\mathbf{r}}_i$ gives the direction towards the $i$th pixel
and $C_\ell$ is the expected or observed CMB power spectrum
(the normalization of $C_\ell$ is irrelevant here; only the
shape matters). Now suppose one or both of the data
sets $\mathbf{y}_1$ and $\mathbf{y}_2$ have a linear calibration error, i.e., are off
by some constant multiplicative factors. Consider a difference
map of the form $\mathbf{z} \equiv \mathbf{y}_1 - r\mathbf{y}_2$ for some factor
$r$, and plot $\nu$ as a function of $r$ using equation (7) with
the null hypothesis being that $\mathbf{z}$ is pure noise, i.e., that
$\mathbf{N} = \mathbf{N}_1 - r[\mathbf{N}_{12} + \mathbf{N}_{12}^t] + r^2\mathbf{N}_2$. An example is shown in
Figure 1. If $\nu \gg 1$ for $r = 1$, then we have a significant
detection of signal not common to the two maps. If $\nu(1) \gg 1$
but $\nu(r) \lesssim 1$ for some other $r$-value, this would show that
there is a relative error of $r$ in the normalization between
the two maps. If the maps are at different frequencies, this
could also indicate that they are dominated by foreground
contamination with a frequency dependence different from
the CMB.
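
The calibration scan described above can be sketched as follows. This is a toy one-dimensional simulation under stated assumptions: the smoothing kernel `G`, the signal amplitude, the white noise and the calibration factor `r_true` are all hypothetical, and the noise covariance of $\mathbf{y}_1 - r\mathbf{y}_2$ is taken to be $\mathbf{N}_1 - r[\mathbf{N}_{12} + \mathbf{N}_{12}^t] + r^2\mathbf{N}_2$.

```python
import numpy as np

def null_buster(z, N, S):
    """nu of equation (7)."""
    Ninv_z = np.linalg.solve(N, z)
    Ninv_S = np.linalg.solve(N, S)
    return (Ninv_z @ S @ Ninv_z - np.trace(Ninv_S)) / np.sqrt(
        2.0 * np.trace(Ninv_S @ Ninv_S))

def calibration_scan(y1, y2, N1, N2, N12, S, r_grid):
    """Evaluate nu(r) for the difference map z = y1 - r*y2 on a grid of r."""
    nus = []
    for r in r_grid:
        z = y1 - r * y2
        N = N1 - r * (N12 + N12.T) + r**2 * N2   # noise covariance of y1 - r*y2
        nus.append(null_buster(z, N, S))
    return np.array(nus)

rng = np.random.default_rng(1)
m = 100
pix = np.arange(m)
G = np.exp(-0.5 * ((pix[:, None] - pix[None, :]) / 3.0) ** 2)  # toy smoothing kernel
S = G @ G.T                                   # signal shape (normalization irrelevant)
x = 100.0 * G @ rng.standard_normal(m)        # high signal-to-noise toy sky
N1 = N2 = np.eye(m)
N12 = np.zeros((m, m))
r_true = 1.2                                  # hypothetical relative calibration error
y1 = x + rng.standard_normal(m)
y2 = x / r_true + rng.standard_normal(m)

r_grid = np.linspace(0.5, 2.0, 151)
nu = calibration_scan(y1, y2, N1, N2, N12, S, r_grid)
r_best = r_grid[np.argmin(nu)]                # r at which the maps look most consistent
nu_at_one = calibration_scan(y1, y2, N1, N2, N12, S, [1.0])[0]
```

In this setup the maps are strongly inconsistent at `r = 1` but become consistent with pure noise near `r_true`, which is the signature of a relative calibration error described in the text.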

3.3. If the beam shape differs 

Above we found the null-buster to be a useful consistency
test when applied to $\mathbf{z} = \mathbf{y}_1 - r\mathbf{y}_2$, since $\mathbf{A}_1 = \mathbf{A}_2$
implied that $\mathbf{y}_1 - \mathbf{y}_2$ should consist of mere noise and be
independent of the (a priori unknown) signal. If $\mathbf{A}_1 \neq \mathbf{A}_2$,
this is no longer true, since the two data sets are not measuring
the same thing. We therefore perform our null-buster
test with the difference map redefined to be

$$\mathbf{z} \equiv \mathbf{A}_3\mathbf{y}_1 - r\mathbf{A}_4\mathbf{y}_2 \tag{9}$$


for some matrices $\mathbf{A}_3$ and $\mathbf{A}_4$, and choose these matrices
so that the new data vectors $\mathbf{A}_3\mathbf{y}_1$ and $\mathbf{A}_4\mathbf{y}_2$ measure
at least approximately the same sky signal. Substituting
equation (1) into equation (9) shows that this corresponds
to the requirement

$$\mathbf{A}_3\mathbf{A}_1 \approx \mathbf{A}_4\mathbf{A}_2, \tag{10}$$

which would make $\mathbf{z}$ approximately signal-independent for
$r = 1$. In addition to equation (10), we clearly want $\mathbf{A}_3\mathbf{A}_1$
and $\mathbf{A}_4\mathbf{A}_2$ to have as large rank as possible, to avoid
destroying more information than necessary (otherwise, say,
$\mathbf{A}_3 = \mathbf{A}_4 = 0$ would do the trick).

If one data set can be written as a linear combination of
the other (say $\mathbf{A}_2 = \mathbf{M}\mathbf{A}_1$ for some matrix $\mathbf{M}$), then the
best choice is clearly $\mathbf{A}_3 = \mathbf{M}$, $\mathbf{A}_4 = \mathbf{I}$. This is the case for
identically sampled maps where $\mathbf{y}_1$ has higher resolution
than $\mathbf{y}_2$, as well as for cases where a map (say DMR or
QMAP) is compared with more complicated weighted averages
(say Saskatoon) of the same sky region at the same
or lower resolution. This approach was adopted by, e.g.,
Lineweaver et al. (1995), for comparing the (smoothed)
Tenerife data to DMR.

We will now tackle the problem for the general case. 
Our solution (there may be others) involves performing a 
signal-to-noise eigenmode analysis (Bond 1994; Bunn & 
Sugiyama 1995; Tegmark et al. 1996b) three times, in an 
unusual way. We will first present this procedure with no 
derivation, then show that it solves our problem. We start 
by solving the generalized eigenvalue problems 



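$$[\mathbf{A}_1\mathbf{C}\mathbf{A}_1^t]\,\mathbf{B}_1 = \mathbf{N}_1\mathbf{B}_1\mathbf{\Lambda}_1, \tag{11}$$
$$[\mathbf{A}_2\mathbf{C}\mathbf{A}_2^t]\,\mathbf{B}_2 = \mathbf{N}_2\mathbf{B}_2\mathbf{\Lambda}_2, \tag{12}$$

where the eigenvectors are the columns of the matrices $\mathbf{B}_1$
and $\mathbf{B}_2$, normalized so that $\mathbf{B}_i^t\mathbf{N}_i\mathbf{B}_i = \mathbf{I}$, and the corresponding
eigenvalues are elements of the diagonal matrices
$\mathbf{\Lambda}_1$ and $\mathbf{\Lambda}_2$, sorted in decreasing order. We then reduce
the width of $\mathbf{B}_1$ and $\mathbf{B}_2$ by throwing away all eigenvectors
with eigenvalues below some cutoff $\lambda_{\min}$, and define a new
smaller data set

$$\mathbf{y} \equiv \begin{pmatrix}\mathbf{B}_1^t\mathbf{y}_1\\ \mathbf{B}_2^t\mathbf{y}_2\end{pmatrix}. \tag{13}$$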



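This will have the covariance matrix $\langle\mathbf{y}\mathbf{y}^t\rangle = \mathbf{S} + \mathbf{N}$, where

$$\mathbf{S} \equiv \begin{pmatrix}\mathbf{\Lambda}_1 & \mathbf{B}_1^t\mathbf{A}_1\mathbf{C}\mathbf{A}_2^t\mathbf{B}_2\\ \mathbf{B}_2^t\mathbf{A}_2\mathbf{C}\mathbf{A}_1^t\mathbf{B}_1 & \mathbf{\Lambda}_2\end{pmatrix}, \tag{14}$$
$$\mathbf{N} \equiv \begin{pmatrix}\mathbf{I} & \mathbf{B}_1^t\mathbf{N}_{12}\mathbf{B}_2\\ \mathbf{B}_2^t\mathbf{N}_{12}^t\mathbf{B}_1 & \mathbf{I}\end{pmatrix}. \tag{15}$$

We then solve the generalized eigenvalue problem

$$\mathbf{S}\mathbf{B} = \mathbf{N}\mathbf{B}\mathbf{\Lambda}, \tag{16}$$

where the eigenvectors are the columns of the matrix $\mathbf{B}$,
normalized so that $\mathbf{B}^t\mathbf{N}\mathbf{B} = \mathbf{I}$, and the corresponding
eigenvalues are elements of the diagonal matrix $\mathbf{\Lambda}$, sorted
in increasing order. Finally, we reduce the width of $\mathbf{B}$
by throwing away all eigenvectors with eigenvalues above
some cutoff $\lambda_{\max}$, leaving us with a matrix of the form

$$\mathbf{B} = \begin{pmatrix}\mathbf{B}_3\\ \mathbf{B}_4\end{pmatrix}. \tag{17}$$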



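By choosing a tiny cutoff (say $\lambda_{\max} = 10^{-\ldots}$), we ensure
that the transformed data vector $\mathbf{B}^t\mathbf{y}$ is completely dominated
by detector noise, with a CMB signal that is negligible
for all practical purposes. This means that the CMB contribution
to $\mathbf{B}^t\mathbf{y}$ satisfies

$$[\mathbf{B}_3^t\mathbf{B}_1^t\mathbf{A}_1 + \mathbf{B}_4^t\mathbf{B}_2^t\mathbf{A}_2]\,\mathbf{x} \approx 0, \tag{18}$$

so we can solve our original problem by defining

$$\mathbf{A}_3 \equiv \mathbf{B}_3^t\mathbf{B}_1^t, \tag{19}$$
$$\mathbf{A}_4 \equiv -\mathbf{B}_4^t\mathbf{B}_2^t. \tag{20}$$
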


Why was the first eigenmode step necessary? We
went through the extra trouble of solving equations (11)
and (12) and applying the cutoff $\lambda_{\min}$ because otherwise,
a number $(\mathbf{B}^t\mathbf{y})_i$ in our final data vector could be
noise-dominated for two different reasons:

1. Because the signal contribution from $\mathbf{y}_1$ approximately
cancels that from $\mathbf{y}_2$.
2. Because it is a noise-dominated mode from $\mathbf{y}_1$ or $\mathbf{y}_2$
(or some combination thereof).

It is clearly only the first case that interests us when comparing
data sets. In the latter case, applying the null-buster
only tests for systematic errors internally, within
each data set, and this is best done before comparing it
with other data sets. We therefore throw away all
noise-dominated modes from the individual data sets, choosing
say $\lambda_{\min} = 1$, before proceeding to the final eigenvalue
problem (16). A lower threshold may be appropriate as
well: as long as we choose $\lambda_{\min} \gg \lambda_{\max}$, we know that
the lack of signal in $\mathbf{B}^t\mathbf{y}$ will be due mainly to subtracting
the data sets.
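
The three eigenmode steps of equations (11)-(20) can be sketched in NumPy as below. This is an illustrative implementation under assumptions, not the author's code: the generalized eigenproblem is solved by Cholesky-whitening the noise matrix, the default cutoffs `lam_min=1.0` and `lam_max=1e-3` are placeholders, and the toy sky covariance and 3-pixel-average experiment are hypothetical.

```python
import numpy as np

def gen_eigh(S, N):
    """Solve S B = N B Lambda with B^t N B = I; eigenvalues in ascending order."""
    L = np.linalg.cholesky(N)                 # N = L L^t
    Linv = np.linalg.inv(L)
    lam, V = np.linalg.eigh(Linv @ S @ Linv.T)
    return lam, Linv.T @ V                    # B = L^{-t} V

def comparison_matrices(A1, A2, N1, N2, N12, C, lam_min=1.0, lam_max=1e-3):
    """Return A3, A4 such that A3 @ A1 is approximately A4 @ A2 (equation 10)."""
    # Step 1: signal-to-noise eigenmodes of each data set, equations (11)-(12).
    l1, B1 = gen_eigh(A1 @ C @ A1.T, N1)
    l2, B2 = gen_eigh(A2 @ C @ A2.T, N2)
    B1, l1 = B1[:, l1 >= lam_min], l1[l1 >= lam_min]   # drop noise-dominated modes
    B2, l2 = B2[:, l2 >= lam_min], l2[l2 >= lam_min]
    # Step 2: signal and noise covariance of the compressed data, equations (14)-(15).
    S12 = B1.T @ A1 @ C @ A2.T @ B2
    S = np.block([[np.diag(l1), S12], [S12.T, np.diag(l2)]])
    X = B1.T @ N12 @ B2
    Nc = np.block([[np.eye(len(l1)), X], [X.T, np.eye(len(l2))]])
    # Step 3: joint eigenmodes, equation (16); keep only signal-free modes.
    lam, B = gen_eigh(S, Nc)
    B = B[:, lam <= lam_max]
    B3, B4 = B[:len(l1)], B[len(l1):]          # split as in equation (17)
    return B3.T @ B1.T, -B4.T @ B2.T           # equations (19) and (20)

# Toy setup: experiment 1 is a plain map, experiment 2 measures 3-pixel averages.
m = 12
pix = np.arange(m)
C = 100.0 * 0.5 ** np.abs(pix[:, None] - pix[None, :])   # assumed sky covariance
A1 = np.eye(m)
A2 = np.zeros((m - 2, m))
for i in range(m - 2):
    A2[i, i:i + 3] = 1.0 / 3.0
N1, N2 = np.eye(m), np.eye(m - 2)
N12 = np.zeros((m, m - 2))

A3, A4 = comparison_matrices(A1, A2, N1, N2, N12, C)
```

The difference map of equation (9) is then `A3 @ y1 - r * A4 @ y2`, whose noise covariance follows from `N1`, `N2` and `N12` by the usual linear propagation. In this toy case experiment 2 is an exact linear function of experiment 1, so the retained modes cancel the sky signal to numerical precision.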
4. COMBINING DATA SETS

4.1. Combining maps of different beam shape 

Suppose that we have performed all the tests described
above and conclude that the data sets $\mathbf{y}_1$ and $\mathbf{y}_2$ are consistent.
We then wish to simplify future calculations by
combining the two data sets into a single map $\tilde{\mathbf{x}}$, inverting
the (usually over-determined) system of linear equations (3).
A physically different but mathematically identical
problem was solved in Tegmark (1997a), showing that
the minimum-variance choice

$$\tilde{\mathbf{x}} = [\mathbf{A}^t\mathbf{N}^{-1}\mathbf{A}]^{-1}\mathbf{A}^t\mathbf{N}^{-1}\mathbf{y} \tag{21}$$

retains all the cosmological information of the original data
sets. Substituting equation (3) into this shows that the
combined map is unbiased ($\langle\tilde{\mathbf{x}}\rangle = \mathbf{x}$) and that the pixel
noise $\boldsymbol{\varepsilon} \equiv \tilde{\mathbf{x}} - \mathbf{x}$ has the covariance matrix

$$\mathbf{\Sigma} \equiv \langle\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^t\rangle = [\mathbf{A}^t\mathbf{N}^{-1}\mathbf{A}]^{-1}. \tag{22}$$
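
Equations (21) and (22) translate directly into a few lines of NumPy. The sketch below assumes a toy setup (an 8-pixel map plus a second experiment measuring two-pixel averages, with arbitrary noise levels); the function name `combine` is hypothetical.

```python
import numpy as np

def combine(A, N, y):
    """Minimum-variance map of equation (21) and its noise covariance, equation (22)."""
    Ninv_A = np.linalg.solve(N, A)            # N^{-1} A
    Sigma = np.linalg.inv(A.T @ Ninv_A)       # [A^t N^{-1} A]^{-1}
    x_tilde = Sigma @ (Ninv_A.T @ y)          # Sigma A^t N^{-1} y (N is symmetric)
    return x_tilde, Sigma

rng = np.random.default_rng(3)
m = 8
A1 = np.eye(m)                                # data set 1: a sky map
A2 = np.zeros((m - 1, m))                     # data set 2: two-pixel averages
for i in range(m - 1):
    A2[i, i:i + 2] = 0.5
A = np.vstack([A1, A2])
N1 = 0.25 * np.eye(m)
N2 = 0.01 * np.eye(m - 1)
N12 = np.zeros((m, m - 1))
N = np.block([[N1, N12], [N12.T, N2]])

x = rng.standard_normal(m)                    # toy true sky
y = A @ x + rng.multivariate_normal(np.zeros(2 * m - 1), N)

x_tilde, Sigma = combine(A, N, y)
x_noiseless, _ = combine(A, N, A @ x)         # with no noise the sky is recovered exactly
```

The diagonal of `Sigma` gives the per-pixel noise variance of the merged map, which here is smaller than that of map 1 alone because the second data set adds information about every pixel.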



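A common special case is that where the two data sets have
uncorrelated noise ($\mathbf{N}_{12} = 0$), simplifying the solution to

$$\tilde{\mathbf{x}} = \mathbf{\Sigma}\,[\mathbf{A}_1^t\mathbf{N}_1^{-1}\mathbf{y}_1 + \mathbf{A}_2^t\mathbf{N}_2^{-1}\mathbf{y}_2], \tag{23}$$
$$\mathbf{\Sigma} = [\mathbf{A}_1^t\mathbf{N}_1^{-1}\mathbf{A}_1 + \mathbf{A}_2^t\mathbf{N}_2^{-1}\mathbf{A}_2]^{-1}. \tag{24}$$

An even simpler case occurs if the first data set is already
a sky map (say the QMAP map), so that $\mathbf{A}_1 = \mathbf{I}$, and we
wish to combine it with a more complicated data set covering
the same sky region (say the Saskatoon observations).
For this case, equation (23) can be rewritten as

$$\tilde{\mathbf{x}} = \mathbf{y}_1 + \mathbf{\Sigma}\mathbf{A}_2^t\mathbf{N}_2^{-1}(\mathbf{y}_2 - \mathbf{A}_2\mathbf{y}_1), \tag{25}$$
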

which has a simple interpretation. The vector $\mathbf{A}_2\mathbf{y}_1$ is just
map 1 convolved with the observing strategy of experiment
2, so the factor $(\mathbf{y}_2 - \mathbf{A}_2\mathbf{y}_1)$ contains only noise. The map
$\tilde{\mathbf{x}}$ is thus obtained by correcting $\mathbf{y}_1$ with a pure noise term
that partially cancels some of its noisiest modes.

This important case applies also to combining two maps
with different angular resolution. For instance, if $\mathbf{y}_1$ and
$\mathbf{y}_2$ have narrow Gaussian beams of resolution $\theta_1$ and $\theta_2$,
with $\theta_2 > \theta_1$, then we define $\mathbf{x}$ to be the true sky map
smoothed by $\theta_1$ and have $(\mathbf{A}_2)_{ij} = \exp[-\theta_{ij}^2/2\theta^2]/2\pi\theta^2$,
where $\theta_{ij} \equiv \cos^{-1}(\hat{\mathbf{r}}_i\cdot\hat{\mathbf{r}}_j)$ is the angular separation between
pixels $i$ and $j$ and $\theta \equiv (\theta_2^2 - \theta_1^2)^{1/2}$ is the extra smoothing
in map 2. Despite occasional claims to the contrary, this
shows that two maps at different resolution can be combined
without destroying any information, without first
degrading the higher resolution down to the lower one by
smoothing. Instead, equation (25) will use the lower resolution
map $\mathbf{y}_2$ to improve the accuracy of the large-scale
fluctuations in $\mathbf{y}_1$ (as was done by Schlegel et al. 1998
when combining the DIRBE and IRAS maps), retaining
all the information from both maps.
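
As a hedged illustration of combining maps of different resolution, the sketch below uses a one-dimensional strip of pixels rather than the sphere, so the smoothing matrix `A2` is a row-normalized 1D Gaussian instead of the 2D kernel quoted above. Beam widths and noise levels are arbitrary assumptions. It checks numerically that equations (23) and (25) give the same map.

```python
import numpy as np

m = 64
pix = np.arange(m)
sep = np.abs(pix[:, None] - pix[None, :])        # pixel separations (toy 1D geometry)

theta1, theta2 = 1.0, 3.0                        # assumed beam widths, in pixels
theta = np.sqrt(theta2**2 - theta1**2)           # extra smoothing in map 2
A2 = np.exp(-0.5 * (sep / theta) ** 2)
A2 /= A2.sum(axis=1, keepdims=True)              # normalize so a constant sky is preserved

N1 = 1.0 * np.eye(m)                             # noisy high-resolution map
N2 = 0.01 * np.eye(m)                            # cleaner low-resolution map

rng = np.random.default_rng(2)
x = rng.standard_normal(m)                       # sky at the resolution of map 1 (toy)
y1 = x + 1.0 * rng.standard_normal(m)
y2 = A2 @ x + 0.1 * rng.standard_normal(m)

# Equation (24) with A1 = I.
Sigma = np.linalg.inv(np.linalg.inv(N1) + A2.T @ np.linalg.solve(N2, A2))
# Equation (23): weighted sum of both data sets.
x_23 = Sigma @ (np.linalg.solve(N1, y1) + A2.T @ np.linalg.solve(N2, y2))
# Equation (25): map 1 plus a pure-noise correction.
x_25 = y1 + Sigma @ A2.T @ np.linalg.solve(N2, y2 - A2 @ y1)
```

The merged map keeps the pixelization and resolution of `y1`; the low-resolution data only enter through the correction term, which reduces the noise variance in every pixel.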

Note that even the case of two identical maps ($\mathbf{A}_1 = \mathbf{A}_2 = \mathbf{I}$)
can be non-trivial. Equation (23) shows that the
optimal combination is $\tilde{\mathbf{x}} = \mathbf{\Sigma}[\mathbf{N}_1^{-1}\mathbf{y}_1 + \mathbf{N}_2^{-1}\mathbf{y}_2]$. This
was used in the combined QMAP analysis (de Oliveira-Costa
et al. 1998), and reduces to separate averaging for
each pixel only if the two maps have vanishing or identical
noise correlations.
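
The following sketch compares per-pixel inverse-variance averaging with the optimal combination for two identical maps whose noise is correlated between pixels. The two noise models (correlation coefficients of +0.9 and -0.9 between neighbouring pixels) are hypothetical and chosen to make the difference obvious.

```python
import numpy as np

m = 6
lag = np.abs(np.arange(m)[:, None] - np.arange(m)[None, :])
N1 = 0.9 ** lag              # map 1: positively correlated pixel noise
N2 = (-0.9) ** lag           # map 2: alternating-sign correlations, same pixel variances

# Per-pixel inverse-variance weighting, ignoring off-diagonal noise correlations.
d1, d2 = np.diag(N1), np.diag(N2)
W = np.diag((1.0 / d1) / (1.0 / d1 + 1.0 / d2))
I = np.eye(m)
cov_pixelwise = W @ N1 @ W + (I - W) @ N2 @ (I - W)

# Optimal combination: x = Sigma [N1^{-1} y1 + N2^{-1} y2].
Sigma = np.linalg.inv(np.linalg.inv(N1) + np.linalg.inv(N2))

# With identical noise correlations the optimal map is the plain average.
Sigma_same = np.linalg.inv(2.0 * np.linalg.inv(N1))
```

Here both maps have unit variance in every pixel, so per-pixel averaging gives a variance of 0.5 per pixel, while the diagonal of `Sigma` is considerably smaller because the two correlation patterns are complementary.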



4.2. Combining maps at different frequencies to remove 
foregrounds 

As described in Tegmark (1998) and further elaborated
in White (1998), foregrounds can be treated as simply an
additional source of noise that is correlated between channels
($\mathbf{N}_{12} \neq 0$). This means that if we have $d$ data sets
measured at different frequencies $\nu_\alpha$, $\alpha = 1, 2, \ldots, d$, each
defined by their own matrix $\mathbf{A}_\alpha$, the best$^5$ way to combine
them is still given by equation (21); we simply need to
include more physics in the noise covariance matrix $\mathbf{N}$.
Let us be more explicit about this. The noise covariance
matrix $\mathbf{N}$ will be of size $n \times n$, where $n = \sum_{\alpha=1}^{d} n_\alpha$, i.e.,
the total number of numbers in all data sets combined.
We will therefore write the elements of $\mathbf{N}$ as $\mathbf{N}_{\alpha i\beta j}$, where
$\alpha$ and $\beta$ determine the data sets and $i$ and $j$ the numbers
therein. If there are $f$ foreground components, this noise
matrix becomes a sum $\mathbf{N} = \sum_{k=0}^{f}\mathbf{N}^{(k)}$, where $\mathbf{N}^{(0)}$ is the
contribution from instrumental noise and the other terms
are the contributions from foregrounds (synchrotron emission,
bremsstrahlung, dust, point sources, etc.). Each of
these foreground matrices will be of the form

$$\mathbf{N}^{(k)}_{\alpha i\beta j} = \mathbf{F}^{(k)}_{\alpha\beta}\,[\mathbf{A}_\alpha\mathbf{C}^{(k)}\mathbf{A}_\beta^t]_{ij}, \tag{26}$$


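where the $d \times d$ matrix $\mathbf{F}^{(k)}$ gives the covariance of the
$k$th foreground between frequencies, and the $m \times m$ matrix
$\mathbf{C}^{(k)}$ gives its spatial covariance between pixels. For
example, if the data sets are identical maps ($\mathbf{A}_\alpha = \mathbf{I}$) at
$d = 2$ different frequencies, we obtain the $(2m) \times (2m)$
block matrix

$$\mathbf{N}^{(k)} = \mathbf{F}^{(k)} \otimes \mathbf{C}^{(k)} = \begin{pmatrix} F^{(k)}_{11}\mathbf{C}^{(k)} & F^{(k)}_{12}\mathbf{C}^{(k)}\\ F^{(k)}_{21}\mathbf{C}^{(k)} & F^{(k)}_{22}\mathbf{C}^{(k)} \end{pmatrix}. \tag{27}$$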
Explicit models specifying the foreground dependence on
frequency and position (the matrices $\mathbf{F}$ and $\mathbf{C}$) can be
found in Tegmark (1998). Although it has been common to
characterize $\mathbf{C}$ by a power spectrum as in equation (8), the
above formalism clearly works even if we break the isotropy
by introducing an additional dependence on galactic latitude,
say.
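
A toy version of the foreground treatment in equations (26)-(27) can be written with a Kronecker product. The sketch assumes two identical maps, a single foreground that is perfectly coherent between channels with a made-up amplitude ratio of 1 : 0.6, a made-up spatial covariance `C_fg`, and white detector noise; none of these numbers come from a real foreground model.

```python
import numpy as np

m, d = 5, 2
lag = np.abs(np.arange(m)[:, None] - np.arange(m)[None, :])
C_fg = 100.0 * 0.5 ** lag                  # assumed spatial covariance of the foreground
scaling = np.array([1.0, 0.6])             # assumed foreground amplitude in each channel
F = np.outer(scaling, scaling)             # d x d frequency covariance (fully coherent)

N_det = 0.1 * np.eye(d * m)                # N^(0): detector noise
N_fg = np.kron(F, C_fg)                    # N^(1): block matrix of equation (27)
N = N_det + N_fg

A = np.vstack([np.eye(m)] * d)             # both channels map the same CMB sky
Ninv_A = np.linalg.solve(N, A)
Sigma = np.linalg.inv(A.T @ Ninv_A)
W = Sigma @ Ninv_A.T                       # x_tilde = W @ y, as in equation (21)

rng = np.random.default_rng(4)
x = rng.standard_normal(m)                 # toy CMB sky
fg = 10.0 * rng.standard_normal(m)         # one foreground realization
y = np.concatenate([x + scaling[0] * fg, x + scaling[1] * fg])  # noiseless for clarity
x_tilde = W @ y
```

Because the weights satisfy `W @ A = I`, the CMB passes through unchanged, while the foreground, which enters the two channels with different amplitudes, is strongly suppressed. Scale or latitude dependence would be encoded by changing `C_fg`, and spectral index variations by making `F` less than fully coherent.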

This shows how to best combine data sets at different 
frequencies. When combining different multifrequency ex- 
periments, it is desirable to first combine corresponding 
maps at the same frequency and apply the null-buster test 
for systematic errors. The different frequencies can then 
be merged afterwards, as a second step. 



5 Specifically, under the assumption that foregrounds and detector noise have multivariate Gaussian probability distributions, one can
show (Tegmark 1997) that this foreground removal method retains all the cosmological information present in the multifrequency set of input 
maps, i.e., constitutes an information-theoretically lossless form of data compression reducing all data down to a single CMB map. One does 
not need to assume that the CMB itself is Gaussian. Systematics such as foreground signals, data spikes, and atmospherics seldom have a 
Gaussian probability distribution — in this more general case, the removal method is no longer strictly lossless, but retains the feature that it 
minimizes the total rms of foregrounds and noise assuming only their finite second moment. 
5. CONCLUSIONS

We have derived a set of useful tools for comparing and 
combining CMB data sets, all based on simple matrix op- 
erations, and drawn the following conclusions: 

1. The "null-buster" test is better at detecting systematic
errors than a simple $\chi^2$-test.

2. Such a consistency test can be performed even be- 
tween two experiments with quite different beam 
shape and observing strategy. 

3. When combining two maps of different angular res- 
olution, one need not smooth the higher resolution 
down to the lower one. 

4. When combining two identical maps, one should gen- 
erally not do the averaging separately for each pixel. 

5. Our map merging method also handles the case of 
e.g. Planck, where the beams are elliptical rather 
than round and the different detectors have different 
beam orientations. 



6. The foreground removal method of Tegmark (1998) 
is a special case of the combination technique we 
derived, and can be carried out even for foreground 
models involving frequency dependence, scale depen- 
dence, latitude dependence and spectral index vari- 
ations in combination. 

As the available CMB data sets continue to increase in 
quantity and quality, it will be useful to perform such 
cross-checks against systematic errors and incrementally
combine all consistent data sets into a single state-of-the-art
map containing our entire knowledge of the CMB sky.$^6$
Both theories and new observations can then be tested 
against this combined map as it gradually grows in size, 
quality and resolution. 

The author wishes to thank Angelica de Oliveira-Costa 
and Martin White for helpful comments. Support for this 
work was provided by NASA through grant NAG5-6034
and Hubble Fellowship HF-01084.01-96A from STScI, operated
by AURA, Inc. under NASA contract NAS5-26555.
6 The signal and noise covariance matrices $\mathbf{S}$ and $\mathbf{N}$ of such a map could of course be rather complicated. However, the same complication
is encountered if one maintains separate experiment-specific maps and only attempts to combine the power spectrum measurements. This 
is because the sample variance of the band power measurements from different experiments are generally not independent, and computation 
of the correlation requires knowledge of these matrices. A disadvantage of combining only at the power spectrum level is that this generally 
sacrifices useful information on relative phases, whereas the merged map retains all the cosmological information from both data sets. For 
instance, near degeneracies in the noise covariance matrices of two maps can often be broken by combining them. 



